1. Progress of the Ph.D. Student Supported by this Grant

Mr. Bombran Shetty, a Ph.D. student in the Industrial Engineering and
Management Sciences Department at Northwestern University, was supported
(tuition and stipend) by this grant. He was unable to make significant
progress toward the solution of the problem addressed in the research
proposal, although he spent a great deal of time on it and worked very
diligently. Finally, in March 1974, he was forced to drop out of school for
personal reasons. This unfortunate series of circumstances caused
considerable delay in the progress of the research.

2. Sequential Decision Analysis

The principal investigator began to work full time on the grant after
the departure of Mr. Shetty. It was felt that in order to develop an
adaptive estimator for processes in which the mean and variance of the
observation noise are unknown and may be changing in time, a procedure must be
developed for making sequential decisions on non-stationary stochastic processes.
Current statistical decision theory deals only with time-independent
random variables, and the results of optimal stochastic control theory, which
do deal with the above problem, are usually not amenable to actual
algorithmic implementation. Research toward development of such a procedure
produced some independently interesting results, which are contained in the
accompanying paper. These results also solve a major portion of the
problem addressed in the research proposal. This paper is being submitted for
publication to the Institute of Mathematical Statistics, and will be
presented at the 1975 ORSA meeting in Chicago.

3. Continuing Work

The principal investigator is continuing research on the problems 
addressed in the research proposal for this grant. One paper on model 
evaluation is currently being revised and the use of the above sequential 
decision algorithm in adaptive Kalman filtering is being considered. In 
the Kalman filter, the losses incurred for using an incorrect model are 
well-known, and these will be used as the loss function in the decision 
algorithm. Results of this research will be forwarded to NASA, as an 
addendum to this final report, when they are completed. 



SEQUENTIAL DECISION ANALYSIS
FOR
NON-STATIONARY STOCHASTIC PROCESSES


by 

Brian Schaefer 
Northwestern University 
Evanston, Illinois 

September 1974 
Abstract 

A formulation of the problem of making decisions concerning the state
of non-stationary stochastic processes is given. An optimal decision rule,
for the case in which the stochastic process is independent of the decisions
made, is derived. It is shown that this rule is a generalization of the
ratio test is given, in which the optimal thresholds may vary with time.

I. Introduction


The general framework of the sequential decision problem has 
remained in essentially the same form as originally formulated by Wald 
[1947]. This formulation involved a sequential decision concerning the 
choice of obtaining another sample or making a final decision. 

In this paper we generalize this problem to include making de- 
cisions on the state of a non-stationary stochastic process and are able 
to obtain a convenient solution for the case in which the state of the 
process is independent of the decisions made. 

Such a formulation is of interest in problems involving estimation 
or signal detection as used in the tracking of missiles or commercial aircraft. 
In these problems, the decisions usually do not influence the original
process. Another area in which this formulation is appropriate would be
problems involving economic decisions where the processes under ob- 
servation, such as stock prices or government indices, are relatively 
independent of decisions made on a personal or corporate level. 

II. Sequential Decisions for Stochastic Processes 

We will let $T$ be a linearly ordered parameter set, and will assume
that $z(t),\ t \in T$, is a stochastic process defined on a probability
space $(\Theta, \mathcal{F}, \pi)$. We will also let $y(t),\ t \in T$, be a stochastic
process defined on a family of probability spaces
$\{(\Omega, \mathcal{F}_y, P_\theta),\ \theta \in \Theta\}$.

The set of admissible actions will be given as a measurable space
$(\mathcal{A}, A)$. An action process, $a(t),\ t \in T$, will be defined on $\mathcal{A}$;
such a process is similar to a stochastic process without the probability
measure, that is,

$$a: \mathcal{A} \to R^T.$$

A measurable loss function $L: \Theta \times \mathcal{A} \to R$ will be defined, as will a set,
$D$, of measurable decision functions, where $d \in D$ and $d: \Omega \to \mathcal{A}$.
The function $d$ will be required to operate on $\Omega$ only as a causal function
of $y(t),\ t \in T$; that is, for any $t \in T$,

$$a(t) = d(y(\tau),\ \tau \le t).$$

The Bayesian decision problem then consists in finding a $d^* \in D$ such that
the risk function $r(d) = E[L(\theta, d(\omega))]$ is minimized:

$$r(d^*) = \min_{d \in D} \int_\Theta \int_\Omega L(\theta, d(\omega))\, dP_\theta(\omega)\, d\pi(\theta) \tag{1}$$


The complexity of the above minimization problem is determined by the
nature of the probability spaces $\Theta$ and $\Omega$, the loss function $L$, and the
decision set $D$. The loss function $L$ will be defined on $\Theta$ through the
process $z(t),\ t \in T$, so that $L(\theta, d(\omega)) = L(z(t),\ t \in T;\ d(y(t),\ t \in T))$.
If $z(t),\ t \in T$, is a function of $d(y(t),\ t \in T)$ then the minimization in
(1) is a problem usually studied in stochastic optimal control theory, see

Kushner [1967]. In the particular case when $z(t+1)$ is a function
only of $z(t)$ and $a(t)$ the problem is usually referred to as a Markov decision
problem.

Although in the study of stochastic control theory and Markov decision
processes it is possible to obtain necessary conditions for the optimal
decision rule in some cases, these conditions often do not lead to a practical
explicit solution. In the next section we will make several assumptions that
will lead to a simple explicit solution to (1). These assumptions will
usually be true in the case where the problem involves decisions concerning
the state of the $z(t),\ t \in T$, process, rather than the control of this process.
In Wald's original formulation the process $z(t),\ t \in T$, is a constant, $z(t) = \alpha_1$
or $\alpha_2$ for all $t \in T$, and the loss function is a constant until a decision
is made, representing the cost of observations, and zero after the decision
has been made. Although Wald did not adopt the Bayesian context, his results
would be unaltered if equal a priori probabilities were assumed. In the
following section we will derive an interesting analog to Wald's results.


III. Non-Controlled Processes and Independent Action Processes 

In this section we will assume that $z(t),\ t \in T$, is not a function
of $a(t),\ t \in T$, and will consider the following form of the loss function:

$$L(\theta, a) = \int_T L(z(t), a(t))\, dt = \int_T L(z(t), d(y(\tau),\ \tau \le t))\, dt \tag{2}$$

where $L(z(t), a(t)) \ge 0 \quad \forall\, t \in T$.


The notation $f(z(t))$ will imply that $f(\cdot)$ may be a function of $t$ as well as
the value of $z$ at $t$; that is, $f(z(t)) = f(z(t), t)$. Assuming that the following
integrals exist, the risk function, (1), becomes, with some abuse of
notation,


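$$
\begin{aligned}
r(d) &= \int_\Theta \int_\Omega \int_T L(z(t), d(y(\tau),\ \tau \le t))\, dt\, dP_\theta(\omega)\, d\pi(\theta) \\
     &= \int_T \int_\Omega \int_\Theta L(z(t), d(y(\tau),\ \tau \le t))\, d\pi(\theta \mid \omega)\, dP(\omega)\, dt \\
     &= \int_T \int \int L(z(t), d(y(\tau),\ \tau \le t))\, d\pi(z(t) \mid y(\tau),\ \tau \le t)\, dP(y(\tau),\ \tau \le t)\, dt
\end{aligned}
$$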


We will let $\mathcal{A}(t)$ represent the set of admissible actions
at time $t$, and we will make the assumption that $\mathcal{A}(t)$ is independent of $a(\tau)$,
$\tau \ne t$, for any $t, \tau \in T$. That is, the set of admissible actions at any given
time does not depend on an action taken at any other time. Such processes
$a(t),\ t \in T$, we will call independent action processes.
If we let

$$\mathcal{L}(a, y, t) = \int L(z(t), d(y(\tau),\ \tau \le t) = a)\, d\pi(z(t) \mid y(\tau),\ \tau \le t), \quad a \in \mathcal{A}(t) \tag{3}$$

then the above risk function becomes

$$r(d) = \int_T \int \mathcal{L}(a, y, t)\, dP(y(\tau),\ \tau \le t)\, dt \tag{4}$$


Theorem

If $a(t),\ t \in T$, is an independent action process; $z(t),\ t \in T$, does
not depend on $a(t),\ t \in T$; and the loss function is of the form given in (2),
then the risk (1) is minimized by the following decision rule:

$$d(y(\tau),\ \tau \le t) = a^*, \quad a^* \in \mathcal{A}(t)$$

iff

$$\mathcal{L}(a^*, y, t) \le \mathcal{L}(a, y, t) \quad \forall\, a \in \mathcal{A}(t) \tag{5}$$


Proof

Since $L(z(t), d(y(\tau),\ \tau \le t)) \ge 0\ \ \forall\, t \in T$, from (3) we have
$\mathcal{L}(a, y, t) \ge 0$ for all $a \in \mathcal{A}(t)$, $t \in T$, and all $y(\tau),\ \tau \le t$. Thus (4) will
be minimized by choosing the $a \in \mathcal{A}(t)$ that minimizes $\mathcal{L}(a, y, t)$ for each $t \in T$
and each $y(\tau),\ \tau \le t$. Given (4), this last statement may be proved simply
by contradiction. $\square$


We will also define

$$Q(a, y, t) = \int_R L(z(t), d(y(\tau),\ \tau \le t) = a)\, P(y(\tau),\ \tau \le t \mid z(t))\, d\pi(z(t)) \tag{6}$$

then

Corollary

Given the assumptions of the above theorem, the risk (1) is
minimized by the decision rule

$$d(y(\tau),\ \tau \le t) = a^*, \quad a^* \in \mathcal{A}(t)$$

iff

$$Q(a^*, y, t) \le Q(a, y, t) \quad \forall\, a \in \mathcal{A}(t) \tag{7}$$

Proof

From (3) and (6), $Q(a, y, t) = \mathcal{L}(a, y, t)\, P(y(\tau),\ \tau \le t)$,
and therefore

$$Q(a^*, y, t) \le Q(a, y, t) \quad \text{iff} \quad \mathcal{L}(a^*, y, t) \le \mathcal{L}(a, y, t). \qquad \square$$
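
The identity used in this proof can be spelled out as follows. This is only a sketch: it assumes that $P(y(\tau),\ \tau \le t) > 0$ and that Bayes' rule applies to the measures involved, neither of which is stated explicitly in the text.

```latex
\begin{align*}
d\pi\bigl(z(t) \mid y(\tau),\ \tau \le t\bigr)\, P\bigl(y(\tau),\ \tau \le t\bigr)
  &= P\bigl(y(\tau),\ \tau \le t \mid z(t)\bigr)\, d\pi\bigl(z(t)\bigr), \\
Q(a, y, t)
  &= \int L\bigl(z(t), a\bigr)\, P\bigl(y(\tau),\ \tau \le t \mid z(t)\bigr)\, d\pi\bigl(z(t)\bigr) \\
  &= P\bigl(y(\tau),\ \tau \le t\bigr) \int L\bigl(z(t), a\bigr)\, d\pi\bigl(z(t) \mid y(\tau),\ \tau \le t\bigr) \\
  &= P\bigl(y(\tau),\ \tau \le t\bigr)\, \mathcal{L}(a, y, t).
\end{align*}
```

Because the factor $P(y(\tau),\ \tau \le t)$ is positive and does not depend on $a$, the same action minimizes both $Q$ and $\mathcal{L}$.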


a) Estimation Given a Quadratic Loss Function

If

$$L(z(t), d(y(\tau),\ \tau \le t)) = (z(t) - d(y(\tau),\ \tau \le t))^2$$

then the optimal decision rule as determined from (3) and (5) is

$$d(y(\tau),\ \tau \le t) = E[z(t) \mid y(\tau),\ \tau \le t]. \tag{8}$$


This result is given in Doob [1953].
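
The step from the quadratic loss to the conditional mean can be sketched as below. The symbol $\hat{z}$ for the conditional mean is introduced here only for illustration, and a finite conditional second moment of $z(t)$ is assumed.

```latex
\begin{align*}
\hat{z} &= E\bigl[z(t) \mid y(\tau),\ \tau \le t\bigr], \\
\mathcal{L}(a, y, t)
  &= E\bigl[(z(t) - a)^2 \mid y(\tau),\ \tau \le t\bigr] \\
  &= E\bigl[(z(t) - \hat{z})^2 \mid y(\tau),\ \tau \le t\bigr]
     + 2(\hat{z} - a)\, E\bigl[z(t) - \hat{z} \mid y(\tau),\ \tau \le t\bigr]
     + (\hat{z} - a)^2 \\
  &= E\bigl[(z(t) - \hat{z})^2 \mid y(\tau),\ \tau \le t\bigr] + (\hat{z} - a)^2 .
\end{align*}
```

The cross term vanishes because $E[z(t) - \hat{z} \mid y(\tau),\ \tau \le t] = 0$, and the first term does not depend on $a$, so the minimum is attained at $a = \hat{z}$.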


The conditions required of $y(t)$ and $z(t),\ t \in T$, by the Kalman
filter, Kalman [1960], are precisely those such that (8) can be computed
recursively in time.
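
As a minimal sketch of such a recursion, assume a scalar linear-Gaussian model $z(t+1) = \phi z(t) + w(t)$, $y(t) = z(t) + v(t)$, with independent zero-mean Gaussian noises of variances $q$ and $r$. The model, the function name `kalman_step` and its parameter names are assumptions chosen for illustration and do not appear in the text.

```python
def kalman_step(mean, var, y, phi, q, r):
    """One predict/update cycle of a scalar Kalman filter (illustrative sketch).

    mean, var : conditional mean and variance of z given observations so far
    y         : new observation, assumed to be y = z + noise of variance r
    phi, q    : assumed state transition coefficient and process noise variance
    Returns the updated conditional mean and variance.
    """
    # Predict the state one step ahead.
    mean_pred = phi * mean
    var_pred = phi * phi * var + q
    # Update with the new observation.
    gain = var_pred / (var_pred + r)
    mean_new = mean_pred + gain * (y - mean_pred)
    var_new = (1.0 - gain) * var_pred
    return mean_new, var_new


# Hypothetical numbers: prior mean 0, prior variance 1, one observation y = 2.
print(kalman_step(0.0, 1.0, 2.0, phi=1.0, q=0.0, r=1.0))
```

Under these assumptions the returned mean is the conditional expectation in (8), obtained from the previous step's mean and variance rather than from the whole observation history.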


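b) Finite State and Action Spaces

Suppose

$$z(t) \in \{\alpha_i,\ i = 1, \ldots, n\} \quad \forall\, t \in T$$

and

$$a(t) \in \{\beta_j,\ j = 1, \ldots, m\} \quad \forall\, t \in T$$

and that the observation process $y(t),\ t \in T$, is some process related to
$z(t),\ t \in T$, as depicted in Figure 1.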

The action $a(t)$ is a function of $y(\tau),\ \tau \le t$, and we will define

$$L(z(t) = \alpha_i,\ d(y(\tau),\ \tau \le t) = \beta_j) = L_{ij}(t) \ge 0.$$

In a practical situation we might be trying to determine the state of $z(t)$
from noisy data, and the loss would be minimized if $a(t) = z(t),\ t \in T$.

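The above Theorem gives the following optimal decision rule:

$$a(t) = \beta_k \quad \text{iff} \quad \sum_{i=1}^{n} L_{ik}(t)\, \pi(z(t) = \alpha_i \mid y(\tau),\ \tau \le t) \le \sum_{i=1}^{n} L_{ij}(t)\, \pi(z(t) = \alpha_i \mid y(\tau),\ \tau \le t), \quad j = 1, \ldots, m.$$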
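
A minimal sketch of this finite-state rule is given below. It assumes the posterior probabilities of the states given the observations up to time $t$ have already been computed by some means not specified here. The function name `bayes_action` and the array layout (rows index states, columns index actions) are illustrative assumptions.

```python
from typing import Sequence


def bayes_action(loss: Sequence[Sequence[float]], posterior: Sequence[float]) -> int:
    """Return the action index j minimizing sum_i loss[i][j] * posterior[i].

    loss[i][j] : loss when the state is i and the action is j (assumed layout)
    posterior  : posterior probability of each state given the observations
    """
    n_actions = len(loss[0])
    expected = [
        sum(loss[i][j] * posterior[i] for i in range(len(posterior)))
        for j in range(n_actions)
    ]
    # Ties are resolved in favor of the lowest action index.
    return min(range(n_actions), key=expected.__getitem__)


# Hypothetical two-state, two-action example with 0-1 loss.
loss = [[0.0, 1.0],
        [1.0, 0.0]]
print(bayes_action(loss, [0.7, 0.3]))  # first state more probable -> index 0
```

With a 0-1 loss the rule reduces to choosing the most probable state, which matches the remark that the loss is minimized when $a(t) = z(t)$.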


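(i) Likelihood Ratio Test

It is well known that for simple binary random variables and fixed
sample size, both the optimal Bayes test and the Neyman-Pearson test result
in comparing the likelihood ratio with a simple threshold. If $z(t) \in \{\alpha_1, \alpha_2\}$
and $a(t) \in \{\alpha_1, \alpha_2\}\ \forall\, t \in T$, and assuming that $L_{ii}(t) \le L_{ij}(t)$, $i, j = 1, 2$,
then from (7) the optimal decision rule is shown to be

$$a(t) = \alpha_1 \quad \text{iff} \quad \frac{P(y(\tau),\ \tau \le t \mid z(t) = \alpha_1)}{P(y(\tau),\ \tau \le t \mid z(t) = \alpha_2)} \ge \frac{[L_{21}(t) - L_{22}(t)]\, \pi(z(t) = \alpha_2)}{[L_{12}(t) - L_{11}(t)]\, \pi(z(t) = \alpha_1)},$$

$$a(t) = \alpha_2 \quad \text{otherwise.}$$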

Thus if z(t) is constant in time, and the loss function is independent 
of time, the above decision rule reduces to the familiar likelihood ratio 
test. On the other hand, the result shows that for a stochastic process 
the optimal decision rule consists of a time varying likelihood ratio 
test. 
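
The time-varying threshold can be sketched as follows, assuming the index convention that $L_{ij}(t)$ is the loss when the state is $\alpha_i$ and the action is $\alpha_j$, and that $L_{12}(t) > L_{11}(t)$ so the threshold is finite. The function names and argument names are hypothetical.

```python
def lrt_threshold(l11, l12, l21, l22, prior1, prior2):
    """Threshold on P(y | alpha_1) / P(y | alpha_2) at one time instant.

    lij is the loss for state alpha_i and action alpha_j (assumed convention);
    prior1 and prior2 are the prior probabilities of the two states at time t.
    """
    return ((l21 - l22) * prior2) / ((l12 - l11) * prior1)


def lrt_action(likelihood1, likelihood2, threshold):
    """Return 1 (choose alpha_1) or 2 (choose alpha_2).

    The comparison is cross-multiplied so that likelihood2 == 0 is handled
    without a division.
    """
    return 1 if likelihood1 >= threshold * likelihood2 else 2


# Hypothetical 0-1 loss with priors that change between two time instants.
for prior1, prior2 in [(0.5, 0.5), (0.25, 0.75)]:
    thr = lrt_threshold(0.0, 1.0, 1.0, 0.0, prior1, prior2)
    print(thr, lrt_action(0.6, 0.3, thr))
```

Evaluating `lrt_threshold` at each time instant with the losses and priors for that instant gives the time-varying test; with constant losses and priors the threshold is constant.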


(ii) Analog to Wald's Original Formulation

In Wald's original formulation, he considered processes which were
in one of two states for all $t \in T$; that is, the process was a simple
binary random variable. This will be a special case of the following, in
which we will assume that the process may be in either of two states at any
time $t \in T$:

$$z(t) \in \{\alpha_1, \alpha_2\} \quad \forall\, t \in T.$$

We will let our action space be such that

$$a(t) \in \{\alpha_1, \alpha_2, \alpha_3\} \quad \forall\, t \in T$$

where:

$a(t) = \alpha_1$ corresponds to the decision that $z(t) = \alpha_1$,

$a(t) = \alpha_2$ corresponds to the decision that $z(t) = \alpha_2$,

and $a(t) = \alpha_3$ corresponds to the action of making no decision at this time,
other than to wait until we have obtained another observation. We will
specify a loss function that requires us to pay for this additional
information and waiting time. We will let



$$L_{11}(t) < L_{13}(t) < L_{12}(t) \quad \forall\, t \in T$$

and

$$L_{22}(t) < L_{23}(t) < L_{21}(t) \quad \forall\, t \in T.$$


That is, the loss incurred for making a correct decision is less than the 
loss for making no decision, which is less than the loss for making the 
wrong decision. For simplicity we will normalize the losses such that 

$$L_{11}(t) = L_{22}(t) = 0.$$

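If we assume that $P(z(t) = \alpha_1) = P(z(t) = \alpha_2)\ \forall\, t \in T$ then the decision
rule determined by (7) consists of computing the likelihood ratio

$$\Lambda(y, t) = \frac{P(y(\tau),\ \tau \le t \mid z(t) = \alpha_1)}{P(y(\tau),\ \tau \le t \mid z(t) = \alpha_2)}$$

and setting

$$a(t) = \alpha_1 \quad \text{if} \quad \Lambda(y, t) > \frac{L_{21}(t) - L_{23}(t)}{L_{13}(t)},$$

$$a(t) = \alpha_2 \quad \text{if} \quad \Lambda(y, t) < \frac{L_{23}(t)}{L_{12}(t) - L_{13}(t)},$$

$$a(t) = \alpha_3 \quad \text{otherwise.}$$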

If the loss function is time independent, then the similarity
of this to Wald's result of constant thresholds is striking. In particular,
if we let

$$L_{12} = L_{21} = 1,$$

$$L_{13} = \text{Probability of a Type I error},$$

$$L_{23} = \text{Probability of a Type II error},$$

then these are exactly the Wald thresholds.
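
The correspondence can be checked numerically. The sketch below assumes equal priors and $L_{11} = L_{22} = 0$, in which case comparing the three expected losses in (7) gives an upper threshold $(L_{21} - L_{23})/L_{13}$ and a lower threshold $L_{23}/(L_{12} - L_{13})$ on the likelihood ratio. The function names are hypothetical, and `wald_thresholds` uses the usual approximate Wald bounds $\beta/(1-\alpha)$ and $(1-\beta)/\alpha$.

```python
import math


def indecision_thresholds(l12, l21, l13, l23):
    """Lower and upper likelihood-ratio thresholds for the three-action rule.

    Assumes equal priors and zero loss for correct decisions; lij is the loss
    for state alpha_i and action alpha_j, with j = 3 meaning "no decision".
    """
    upper = (l21 - l23) / l13
    lower = l23 / (l12 - l13)
    return lower, upper


def wald_thresholds(alpha, beta):
    """Approximate Wald thresholds for Type I error alpha and Type II error beta."""
    return beta / (1.0 - alpha), (1.0 - beta) / alpha


# Hypothetical error probabilities.
alpha, beta = 0.05, 0.10
ours = indecision_thresholds(l12=1.0, l21=1.0, l13=alpha, l23=beta)
wald = wald_thresholds(alpha, beta)
print(all(math.isclose(a, b) for a, b in zip(ours, wald)))
```

Setting the wrong-decision losses to 1 and the two indecision losses to the error probabilities reproduces the Wald pair, so the comparison prints `True` for any admissible `alpha` and `beta`.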

If we wish, however, we can make $L_{13}(t)$ and $L_{23}(t)$ increasing
functions of time, which corresponds to making the loss incurred by
indecision greater the longer a decision is delayed. This
would bring the above thresholds closer together as shown in Figure 2,
although we note that the inequality

$$L_{12}(t) > L_{13}(t) + L_{23}(t) \quad \forall\, t \in T$$

must be satisfied.

This inequality is a statement of the fact that if the total cost of 
indecision is greater than the cost of a wrong decision, then a decision should 

A major difference between this test and Wald's test is that
the test continues for a fixed time interval, $T$. Once a threshold is
crossed, the test does not stop; rather, the action is constant until the
same threshold is recrossed in the opposite direction.
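
The behaviour described here can be illustrated with a small sketch that applies rule (7) directly at each time instant, assuming equal priors and zero loss for correct decisions. The function name `three_way_action`, the likelihood-ratio path and the loss values are all hypothetical.

```python
def three_way_action(ratio, l12, l21, l13, l23):
    """Pick the action with the smallest expected loss under equal priors.

    Expected losses are divided by P(y | alpha_2), so only the likelihood
    ratio `ratio` = P(y | alpha_1) / P(y | alpha_2) is needed.
    Returns 1 or 2 for a decision and 3 for "wait"; ties go to 3.
    """
    candidates = [
        (l13 * ratio + l23, 3),  # no decision: pay the indecision loss
        (l21, 1),                # decide alpha_1: loss only if state is alpha_2
        (l12 * ratio, 2),        # decide alpha_2: loss only if state is alpha_1
    ]
    return min(candidates, key=lambda c: c[0])[1]


# Constant losses: the action follows the ratio and can revert to "wait".
ratios = [1.0, 5.0, 12.0, 5.0, 0.5, 0.05]
print([three_way_action(r, 1.0, 1.0, 0.1, 0.1) for r in ratios])
# -> [3, 3, 1, 3, 3, 2]

# Growing indecision loss at a fixed ratio: the thresholds close in.
waits = [0.10, 0.15, 0.20, 0.25]
print([three_way_action(5.0, 1.0, 1.0, c, c) for c in waits])
# -> [3, 3, 1, 1]
```

In the first run the action returns from a tentative decision to "wait" when the ratio falls back below the upper threshold, and in the second the same ratio triggers a decision once the cost of waiting has grown enough.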


IV. Conclusion

This paper has considered the problem of making decisions on
the state of a stochastic process. A solution to the problem has been
derived for the case in which the state of the process is independent
of the decisions made, the set of admissible actions at any time is
independent of the action taken at any other time, and the loss function
is of the form given in (2). These assumptions are usually made
implicitly in the derivation of the conditional expectation as the solution
to the minimum variance estimator, and this solution is shown to
follow from the decision rule derived in this paper. The above assumptions
are also true in the formulation of the standard Bayesian likelihood ratio
test, and the Neyman-Pearson test, and this paper, therefore, is a generalization
of these tests.

In the Wald test, the assumption that the set of admissible actions
is independent of actions taken at any other time is not true, since
once a threshold is crossed no more observations may be taken. Often,
however, one would like to formulate a statistical test in a sequential
manner, and continue to accept observations after one hypothesis has tentatively
been accepted. This would be particularly true if the hypothesis that was
true could change over time. The solution to such a formulation is given
in this paper.